Abstract

Extreme event statistics play a very important role in the theory and
practice of time series analysis. The recovery of classical theoretical results
is often undermined by non-stationarity and dependence between increments.
Furthermore, the convergence to the limit distributions can be slow, requiring
a huge number of records to obtain significant statistics, and thus limiting
its practical applications. Focusing, instead, on the closely related density of
"near-extremes" (the distance between a record and the maximal value) can
make the statistical methods more suitable for practical applications
and/or validations of models. We apply this recently proposed method to the
empirical validation of an adapted financial market model of intraday market
fluctuations.



1. Introduction 

One of the main challenges of quantitative finance has been to come up with
models for stock returns that could reproduce the implied or historical
distributions of asset prices, both to acquire knowledge of the underlying dynamics
of price formation and to consistently price and hedge derivative products.
Refs. 0, 0, S HI are physicist-friendly references discussing the basics of such 
problems. At a coarser level, finance shares many common features with
the study of (unfortunately, not so well-defined) "complex" systems, such as the
"random" nature of the phenomena and the absence of comprehensive and
exhaustive theories. The analysis of extreme events plays a pivotal role whenever
the problem addressed has a stochastic nature, since rare extreme events
can have rather strong or drastic consequences, making it widely useful in
geology and meteorology, as well as in financial economics [f|. Another motivation
for studying extreme events in finance is to account for the observed fat tails of
log-returns (deviation from the Normal distribution in the tails) of stock prices.
Though the field of extreme statistics is very well established as part of classical
probability theory, its applications can easily be hindered by non-stationarity
and, often, slow convergence to the expected results. It is well known that
non-stationarity is the most prevalent cause of anomalous behaviours in financial
time series (see for example the discussion in Ref. @), and thus extreme event
statistics have to be applied with caution in most cases of financial data.

Focusing the analysis on a simple version of an existing financial model, we
show how the recently defined concept of near-extreme distribution Q
can be helpful when studying financial time series data. The poor performance
(slow convergence) of extreme value theory is not totally overcome in this
complementary approach. The main aim of this paper is to qualitatively present
the possible achievements of this theoretical method. Therefore we use simple
standard statistical analyses (Kolmogorov-Smirnov test and Q-Q plot) to
substantiate our results, rather than sophisticated analyses. The paper is organized
as follows: Sec. 2 gives a brief review of the main results of the classical theory
of extreme values, along with a discussion of the limitations of its application.
It also discusses the concept of near-extreme distribution with a related formula.
Sec. 3 describes the financial datasets used and explains how we model
intraday stock returns @ for the demonstration of this approach. Finally,
Sec. 4 is dedicated to the results, analyses, discussions and conclusion.



2. The classical extreme values statistics (EVS) theory 

This theoretical field was born between the 1920s and '30s with the seminal
works of von Mises @, Fréchet ^ and Fisher & Tippett [13, and soon became
a well-established part of classical statistics with the works of Gnedenko [12 1
and Gumbel [r|. Refs. [13, EH are modern and comprehensive introductions to
the subject.

The cumulative distribution function $F$ of the maximum $x_M$ of a finite set
of $N$ iid values $(x_1, x_2, \ldots, x_N)$, distributed according to the
probability density function $g$, may be written as

$$F(x_M) = G(x_M)^N, \tag{1}$$

where $G(x) = \int_{-\infty}^{x} g(u)\,du$ is the cumulative distribution function
of any $x_i$; sometimes $g$ and $G$ are called parent distributions. Note that
here we will use only the term "maximum" throughout, bearing in mind that it
actually refers to both possible extremes, since each observation regarding
maximal values is equally true for minimal values.
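
As a quick numerical illustration of Eq. (1), the following sketch (Python, standard library only) draws blocks of N standard Normal values and compares the empirical cdf of the block maximum with G(x)^N. The function names, the number of blocks and the seed are illustrative choices of ours and are not part of the analysis presented in this paper.

```python
import random
from statistics import NormalDist


def max_cdf_theory(x, n, parent=NormalDist()):
    """Eq. (1): cdf of the maximum of n iid draws from `parent`."""
    return parent.cdf(x) ** n


def max_cdf_empirical(x, n, blocks=20000, seed=0):
    """Fraction of simulated block maxima (standard Normal parent) not exceeding x."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(blocks):
        if max(rng.gauss(0.0, 1.0) for _ in range(n)) <= x:
            hits += 1
    return hits / blocks


if __name__ == "__main__":
    for x in (1.0, 2.0, 3.0):
        print(x, max_cdf_theory(x, 10), max_cdf_empirical(x, 10))
```

The two columns printed for each x should agree up to the sampling noise of the simulation.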

The theory states that, in the limit $N \to \infty$, the function $F(x_M)$ converges
to either a Weibull, a Fréchet or a Gumbel distribution,

$$F(a_N x + b_N) = G(a_N x + b_N)^N \to L(x), \qquad N \to \infty, \tag{2}$$

for some suitable pair of weights $a_N$ and $b_N$ (we will omit the index $M$ from
$x_M$ when the context is clear). What discriminates between the three cases of
the limiting distribution $L(x)$ is the behaviour of $g$ when $x$ tends to infinity, i.e.
the tails. We do not discuss the details of the kind of convergence the theory
predicts in different cases; the interested reader can assume the "conservative"
choice of weak convergence (convergence in distribution). Moreover, we do not
present an exhaustive description of the domains of attraction, but rather focus
on the important cases with a language suited for practical applications.

Weibull (Bounded function)

If the positive support of $g$ is bounded, then $F$ converges to a Weibull
distribution $L_W(x)$:

$$L_W(x) = \exp\left(-(-x)^{\beta}\right), \quad \text{with } \beta > 1. \tag{3}$$

The appropriate weights $a_N$ and $b_N$ are

$$a_N = \omega - \inf\left\{x : 1 - G(x) < \frac{1}{N}\right\} \quad \text{and} \quad b_N = \omega = \sup\{x : G(x) < 1\}, \tag{4}$$

and the parameter $\beta$ is given by the behaviour of $G$ when approaching the value
$\omega = G^{-1}(1)$: $1 - G(x) \sim (\omega - x)^{\beta}$ for $x \to \omega$. If $\omega$ is reached with an exponent
smaller than unity, then the case is degenerate with a limiting distribution
$L_d(x) = \delta(x - \omega)$.

Fréchet (Power-law behaviour)

When $g$ presents a power-law behaviour, $1 - G(x) \sim x^{-\alpha}$ (for $x \to \infty$ and for
$\alpha > 0$), the extremal-value distribution approaches the Fréchet distribution
$L_F(x)$:

$$L_F(x) = \exp\left(-x^{-\alpha}\right). \tag{5}$$

In this case the weights are given by

$$a_N = \inf\left\{x : 1 - G(x) < \frac{1}{N}\right\} \quad \text{and} \quad b_N = 0. \tag{6}$$

Gumbel (Unbounded exponential - or faster - behaviour)

Finally, when the parent function $g$ has an unbounded support and its
behaviour at infinity is exponential (or faster), we recover the Gumbel distribution
$L_G(x)$:

$$L_G(x) = \exp(-\exp(-x)). \tag{7}$$

The weights are given by

$$a_N = \inf\left\{x : 1 - G(x) < \frac{1}{eN}\right\} - b_N \quad \text{and} \quad b_N = \inf\left\{x : 1 - G(x) < \frac{1}{N}\right\}. \tag{8}$$

2.1. The limitations of EVS in straightforward applications

Unfortunately, there are some serious limitations in the direct application of
this theoretical structure. When dealing with financial time series we inherently
work with non-stationary processes, which in this case translates into
log-returns being non-identically distributed. In other words, $x_i$ and $x_j$ with
$i \neq j$ do not have, in general, the same distribution. Moreover, the
convergence to the theoretical distribution is often very slow, and the extreme values
cannot be assumed to be distributed according to $L$ unless the time series
length $N$ is very large [16, 17]. However, we note that the hypothesis of
independence between the elements of the analysed set can be weakened. For
example, the same results still hold in the Gaussian case where the correlation
between $x_i$ and $x_{i+k}$ goes to zero as $k^{-\gamma}$ with $\gamma > 1$, when $k \to \infty$ [18, 19].

To illustrate the slow convergence, we use two important distributions:
Gaussian and Tsallis q-exponential. We choose the Gaussian (standard Normal)
distribution because it is ubiquitous and its definition and properties are well
known, and the latter because of its many recent applications [20, 21, 22] and
its growing usefulness. For the Tsallis q-exponential distribution, we have

$$g(x) = \left(1 + (q - 1)x\right)^{\frac{1}{1-q}} \quad \text{for } x > 0 \text{ with } q > 1; \tag{9}$$

it can be alternatively defined for a larger range of the parameter $q$, but we focus
on this particular case. In econometrics, the same distribution is known as the
generalized Pareto law, and its tail index is equal to $\eta = 1/(q - 1)$. Thus, its
limiting distribution is the Fréchet one. Fig. 1 depicts the extreme distribution
for both Gaussian and q-exponential variables, obtained using simple
pseudo-random generation. The figure shows that even for $N$ of the
order of thousands, the difference between the finite-size theoretical distributions
and the theoretical limiting distributions is still large, while the results for
synthetic data are in fair agreement with the finite-size distribution, as they ought
to be. The mismatch could arise from the weights $a_N$ and $b_N$; it is possible
that the shape is already satisfactorily close to the limiting one, but this effect
is not clear because of a slow convergence in the weights. However, there is
no consistent way to discriminate the nature of the slow convergence, and to
our knowledge no "finite size adjustments" for the weights are present in the
literature.
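
The slow convergence can also be checked numerically for the Gaussian case. The following sketch (Python, standard library only) is not the code used for Fig. 1: it assumes the quantile form of the Gumbel weights, $b_N = G^{-1}(1 - 1/N)$ and $a_N = G^{-1}(1 - 1/(eN)) - b_N$, and measures the largest gap between the finite-size cdf $G(a_N x + b_N)^N$ and the Gumbel limit on a grid. The function names and the grid are illustrative choices.

```python
from math import e, exp
from statistics import NormalDist

STD_NORMAL = NormalDist()


def gumbel_weights(n):
    """Assumed weights: b_N = G^{-1}(1 - 1/N), a_N = G^{-1}(1 - 1/(eN)) - b_N."""
    b = STD_NORMAL.inv_cdf(1.0 - 1.0 / n)
    a = STD_NORMAL.inv_cdf(1.0 - 1.0 / (e * n)) - b
    return a, b


def gumbel_cdf(x):
    """Limiting Gumbel cdf, exp(-exp(-x))."""
    return exp(-exp(-x))


def sup_distance(n, lo=-4.0, hi=8.0, steps=2401):
    """Largest |G(a_N x + b_N)^N - L_G(x)| over an evenly spaced grid in x."""
    a, b = gumbel_weights(n)
    worst = 0.0
    for k in range(steps):
        x = lo + (hi - lo) * k / (steps - 1)
        finite = STD_NORMAL.cdf(a * x + b) ** n
        worst = max(worst, abs(finite - gumbel_cdf(x)))
    return worst


if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        print(n, gumbel_weights(n), sup_distance(n))
```

Comparing `sup_distance(10)` with `sup_distance(1000)` gives a direct feel for how slowly the gap closes as N grows.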

Finally, we must also mention the practical issue that arises from the discrete
nature of prices in the market. Per se, this visible mismatch does not undermine
the application of the theory, but in order to make the function "smooth" and
observe a satisfactory convergence, a very large value of $N$ would again be
required, enormous for the case under consideration. Fig. 2 shows the raw statistics of
minimal values for market data, illustrating the issue.

2.2. The near-extreme distribution

The idea that helps to overcome the mentioned problems is to analyze a 
closely related distribution, the near-extreme distribution, recently proposed in 

Figure 1: Two explicit examples of slow convergence to the limit distribution. Top: standard
Normal distribution. Bottom: q-exponential distribution with index $q = 1.3$ ($\eta = 1/0.3$, see
text after Eq. (9)). The colored lines are the empirical densities of maxima for synthetic variables
drawn from the two different parent distributions (the sample size of each statistic is 1000).
The solid black lines are the limiting densities $L_G((x - b_N)/a_N)/a_N$ and $L_F((x - b_N)/a_N)/a_N$
and, finally, the black dashed lines are the theoretical finite-sample densities.

Figure 2: Minimal-value distributions for the MSFT data for $\tau = 2500$ and different values
of $N = 10, 25, 50$, highlighting the effect of discretization. For the full explanation of
the symbols, legend and parameters $\tau$, $N$ please refer to Sec. 3. The bin size is equal to $10^{-4}$.

Ref. 0, and already applied to similar problems in Ref. 0. Roughly speaking,
the near-extreme distribution is the distribution of the distance from the
maximal value in a finite set. Considering again a finite set of $N$ iid values
$(x_1, x_2, \ldots, x_N)$ distributed according to the parent density $g$ (or its cumulative
$G$), we call their maximum $x_M = \max(x_1, x_2, \ldots, x_N)$. According to Ref. 0,
the empirical near-extreme density with respect to the maximum is defined as

$$p_e(r, N) = \frac{1}{N-1} \sum_{x_i \neq x_M} \delta\left[r - (x_M - x_i)\right], \tag{10}$$

where $r$ is the distance measured from the maximum value $x_M$, and $x_M$ itself is not
counted. Note that in order to obtain $\int_0^{\infty} p_e(r, N)\,dr = 1$ we are using a
different normalization than in Ref. 0.
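
In practice, Eq. (10) amounts to collecting the N - 1 distances from the maximum. The following minimal sketch shows one way to do this; the helper names are ours, and dropping exactly one occurrence of the maximum (so that ties with the maximum enter as zero distances) is an assumption made to keep the 1/(N - 1) normalization.

```python
def near_extreme_distances(values):
    """Distances r = x_M - x_i from the maximum of `values`.

    One occurrence of the maximum is left out, so N values give N - 1 distances.
    """
    x_max = max(values)
    rest = list(values)
    rest.remove(x_max)  # removes only the first occurrence of the maximum
    return [x_max - x for x in rest]


def empirical_cdf(distances, r):
    """Fraction of distances <= r, i.e. the integral of p_e from 0 to r."""
    return sum(1 for d in distances if d <= r) / len(distances)


if __name__ == "__main__":
    sample = [0.3, -1.2, 2.1, 0.7, 1.5]
    dist = near_extreme_distances(sample)
    print(dist, empirical_cdf(dist, 1.0))
```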

Under the assumptions and with the notation introduced
above, we obtain the following expression for the expected density:

$$p(r, N) = \int_{-\infty}^{+\infty} N g(x)\, G(x)^{N-2}\, g(r - x)\,dx. \tag{11}$$

This can be justified by noticing that $p(r, N)$ is the convolution of the density of
the distance from a given maximum with the pdf of the maximum; the former
can be written as $g(x - x_M)/G(x_M)$ and the latter as $N g(x_M)\, G^{N-1}(x_M)$ (the
derivative of Eq. (1)).
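
Eq. (11) is straightforward to evaluate numerically. The sketch below does so for a zero-mean Normal parent with a plain trapezoidal rule; the integration range, the grid sizes and the function names are illustrative assumptions. For N = 2 the integral reduces to the density of the absolute difference of two Normal values, which gives a simple check at r = 0, and the total mass over r >= 0 should be close to one.

```python
from math import erf, exp, pi, sqrt


def norm_pdf(x, sigma=1.0):
    return exp(-0.5 * (x / sigma) ** 2) / (sigma * sqrt(2.0 * pi))


def norm_cdf(x, sigma=1.0):
    return 0.5 * (1.0 + erf(x / (sigma * sqrt(2.0))))


def near_extreme_density(r, n, sigma=1.0, half_width=10.0, steps=801):
    """Eq. (11) for a N(0, sigma^2) parent, trapezoidal rule in x."""
    lo = -half_width * sigma
    dx = 2.0 * half_width * sigma / (steps - 1)
    total = 0.0
    for k in range(steps):
        x = lo + k * dx
        weight = 0.5 if k in (0, steps - 1) else 1.0
        total += (weight * n * norm_pdf(x, sigma)
                  * norm_cdf(x, sigma) ** (n - 2) * norm_pdf(r - x, sigma))
    return total * dx


def total_mass(n, r_max=12.0, steps=121):
    """Trapezoidal integral of the density over 0 <= r <= r_max (sigma = 1)."""
    dr = r_max / (steps - 1)
    total = 0.0
    for k in range(steps):
        weight = 0.5 if k in (0, steps - 1) else 1.0
        total += weight * near_extreme_density(k * dr, n)
    return total * dr


if __name__ == "__main__":
    print(near_extreme_density(0.0, 2), 1.0 / sqrt(pi))
    print(total_mass(10))
```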

In Ref. 0, the authors describe the property of p(r,N) when N goes to 
infinity and show that near-extreme density converges to different limiting forms 
depending on the tail of the original distribution. However, we work here with
a finite sample size $N$ only and are not concerned with the limiting forms.



3. The data and model

3.1. Data set 

The original data set consists of all trades registered in the primary markets
of the analyzed stocks. The data are stored in the Thomson Reuters RDTH
database made available to the Chair of Quantitative Finance by BNP Paribas. For
the purpose of our study, we extract from the RDTH database records consisting
of the time of a transaction, the bid and ask prices prior to each transaction,
and the traded price. These data, appropriately filtered in order to remove
misprints in prices and times of execution, correspond to the trades registered at
NYSE or at NASDAQ during 2007 for four shares of the Dow Jones Industrial
Average Index at that time, namely C.N, GE.N, INTC.O and MSFT.O.
C.N and GE.N were primarily traded at NYSE, while INTC.O and MSFT.O
were primarily traded at NASDAQ. The full meaning of the symbols is available
from www.reuters.com. The choice of one year of data is a trade-off between
the need for enough data for significant statistical analyses and
the goal of minimizing the effect of strong macro-economic fluctuations.
However, the consistency of the discussed results during periods of extreme conditions
is beyond the scope of the present paper, and is left for future studies.
The stocks were chosen among the most active at that time since, as already
discussed, the size of the statistics can pose significant problems. We do not
present results on a large set of different stocks because our main purpose is to
show the approach rather than the final "fit". An extensive study would need
a formal definition of approximation rather than a qualitative and general one.
Moreover, we are aware that the method has to be finely tuned to the characteristics
of the single stock (activity, price, particular trends, etc.) and that a raw or
straightforward application of it could fail in general. For this reason we show the
results for the simplest possible situation: four similarly liquid stocks in the same
index and in the same economy (but not in the same sectors: finance for C.N,
general industry for GE.N and advanced technology for INTC.O and MSFT.O).

For each day the considered period is precisely 10:00 to 15:45. This
choice restricts the analysis to the central part of the trading day,
discarding the opening and closing periods. This is justified
since data exhibit fewer "anomalies" during the central part of the trading day:
errors tend to occur more often during the first and last parts of the continuous
trading day (it often happens that some shares are opened for trading several
minutes after the others, due to potential issues during the opening auction).

Note that we do not use "physical time" as our unit of measure, but rather
consider "trading time" (a.k.a. "event time" or "tick time") [24]. It is
incremented each time an "event" occurs, that is, each time the mid-price changes,
i.e., each time either the bid or the ask price changes. In this way, we do not
consider a trade leaving the mid-price unchanged as an event. As described
in the following subsection, the final analysis is then performed by aggregating
(summing) these events.
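
A minimal sketch of this event-time sampling is given below. The input layout (a sequence of (bid, ask) pairs in chronological order) and the function name are assumptions made for illustration and do not reflect the actual RDTH record format. Exact floating-point comparison of mid-prices is used for simplicity; with real quotes it would be safer to work with integer tick counts.

```python
def mid_price_events(quotes):
    """Return the mid-price series sampled in event time.

    `quotes` is an iterable of (bid, ask) pairs. A new entry is stored only
    when the mid-price differs from the previously stored one.
    """
    events = []
    for bid, ask in quotes:
        mid = 0.5 * (bid + ask)
        if not events or mid != events[-1]:
            events.append(mid)
    return events


if __name__ == "__main__":
    quotes = [(10.0, 10.5), (10.0, 10.5), (10.0, 11.0), (10.5, 11.0)]
    print(mid_price_events(quotes))
```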

3.2. The intraday model and approximation idea

Given $S_i$, the price time series sampled in "event time", and a time-lag $\tau$,
the log-returns $r_i^{\tau} = \ln(S_{i+\tau}/S_i)$ may be modeled Q as a discrete-time stochastic
process with a fluctuating variance:

$$r_i^{\tau} = \sigma_i \epsilon_i, \tag{12}$$

where $\sigma_i^2$ is the local variance of the process, and the $\epsilon_i$ are samples of a standard
Normal distribution. Any drift for the returns may be neglected for the time
scales under consideration. It is assumed that $\sigma_i$ varies slowly enough that
it can be treated as a constant over intraday time scales. Replacing $\sigma_i$ with
its local constant value $\sigma$, individual returns can be approximated as $r_i^{\tau} \approx \sigma \epsilon_i$.
Eq. (12) is a simplified version of the ARCH-GARCH-like models used by the authors
of Ref. § to fit market fluctuations using simple statistics of the squared returns.
As often happens in econometrics, this leads to a mixture of Normal distributions,
helping us to model the non-stationarity and to overcome the tedious problem
of evaluating the tail index of a candidate parent distribution $G$ [25, 26]. For the
sake of clarity, we stress that although we call the parameter $\tau$ a "time-lag" or
"time-window", it is not directly connected with physical time.
In our studies we proceed as follows: 

• Fix a positive integer $\tau$, representing a time-lag.

• From the original price time series $S_i$ we extract the $r_i^{\tau}$ defined as

$$r_i^{\tau} = \log \frac{S_{i+\tau}}{S_i}$$

without allowing any overlap.

• Arrange consecutive $r_i^{\tau}$ in sets of length $N$, labeled with the index $j$ that
runs from 1 to $h$, such that $h$ is the number of sets obtained and the
total time series length is $T = h \times N$. Within each set $j$, we estimate $\sigma_j^2$ as
the classical sample variance, and also evaluate the near-extreme statistics
with respect to its maximum $x_M^j$, according to Eq. (10).

• The empirical near-extreme statistics is then given by the aggregation of
each one of those statistics:

$$p_e(r^{\tau}, N) = \frac{1}{h} \sum_{j=1}^{h} \frac{1}{N-1} \sum_{x_i^j \neq x_M^j} \delta\left[r^{\tau} - (x_M^j - x_i^j)\right]. \tag{13}$$
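
The steps above can be sketched as follows. This is an illustration under our own assumptions rather than the code used for the results of this paper: the helper names are hypothetical, prices are taken as a plain list already sampled in event time, a trailing incomplete set is simply dropped, and one occurrence of the maximum is removed in each set.

```python
from math import log
from statistics import variance


def log_returns(prices, tau):
    """Non-overlapping log-returns ln(S[k + tau] / S[k]) for k = 0, tau, 2 tau, ..."""
    return [log(prices[k + tau] / prices[k])
            for k in range(0, len(prices) - tau, tau)]


def block_statistics(returns, n):
    """Split `returns` into h consecutive sets of length n.

    For each set, return (sample variance, distances from the set maximum).
    """
    h = len(returns) // n
    result = []
    for j in range(h):
        block = returns[j * n:(j + 1) * n]
        x_max = max(block)
        rest = list(block)
        rest.remove(x_max)  # the maximum itself is not counted
        result.append((variance(block), [x_max - x for x in rest]))
    return result


def pooled_distances(blocks):
    """All distances from all sets, sorted: the support of the empirical cdf."""
    return sorted(d for _, distances in blocks for d in distances)


if __name__ == "__main__":
    prices = [100.0, 100.5, 100.25, 101.0, 100.75, 101.5, 101.25, 102.0, 101.0]
    blocks = block_statistics(log_returns(prices, 1), 4)
    print([v for v, _ in blocks], pooled_distances(blocks))
```

Since every set has the same length, pooling the distances with equal weight corresponds to the 1/h average over sets.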

After some straightforward algebra and using our standing assumption, we
expect the near-extreme distribution to be

$$p(r^{\tau}, N) = \frac{1}{h} \sum_{j=1}^{h} \int_{-\infty}^{+\infty} N g_j(x)\, G_j(x)^{N-2}\, g_j(r^{\tau} - x)\,dx, \tag{14}$$
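where

$$g_j(x) = \mathcal{N}(0, \sigma_j^2) \quad \text{and} \quad G_j(x) = \frac{1}{2}\left[1 + \operatorname{erf}\left(\frac{x}{\sqrt{2}\,\sigma_j}\right)\right]. \tag{15}$$

The corresponding cumulative distribution $P$, obtained by integrating Eq. (14)
from 0 to $r^{\tau}$ and using the symmetry of $g_j$, is given by

$$P(r^{\tau}) = \frac{1}{h} \sum_{j=1}^{h} \int_{-\infty}^{+\infty} N g_j(x)\, G_j(x)^{N-2}\, G_j(r^{\tau} - x)\,dx - \frac{1}{N-1}, \tag{16}$$

where the constant term ensures $P(0) = 0$ and $P(r^{\tau}) \to 1$ for $r^{\tau} \to \infty$.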



In words, the near-extreme distribution of log-returns over a time-window
$\tau$ can be obtained as a mixture of the corresponding near-extreme
distributions for $h$ Gaussian variables, where each considered set of length $N$
contributes a single element to the mixture.

If the model and the underlying assumptions are valid, we should observe an
agreement between the expected "theoretical" distribution in Eq. (14) and the
empirical distribution corresponding to Eq. (13).
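
The theoretical mixture cdf can be evaluated numerically along the following lines. In this sketch the cdf of a single Gaussian set is written in the equivalent form $1 - \int N g(x) G(x)^{N-2} G(x - r)\,dx$, which follows from integrating the near-extreme density over $[0, r]$ for a symmetric parent, and a set with standard deviation $\sigma_j$ is handled by rescaling $r$ to $r/\sigma_j$. The function names, the trapezoidal grid and the example values of $\sigma_j$ are illustrative assumptions; sets with zero estimated variance (possible with discrete prices) would have to be treated separately.

```python
from math import erf, exp, pi, sqrt


def _pdf(x):
    return exp(-0.5 * x * x) / sqrt(2.0 * pi)


def _cdf(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def unit_near_extreme_cdf(r, n, half_width=10.0, steps=2001):
    """Near-extreme cdf for n iid standard Normal values.

    Computed as 1 - integral of n g(x) G(x)^(n-2) G(x - r) dx (trapezoidal rule).
    """
    dx = 2.0 * half_width / (steps - 1)
    total = 0.0
    for k in range(steps):
        x = -half_width + k * dx
        weight = 0.5 if k in (0, steps - 1) else 1.0
        total += weight * n * _pdf(x) * _cdf(x) ** (n - 2) * _cdf(x - r)
    return 1.0 - total * dx


def mixture_cdf(r, n, sigmas):
    """Equal-weight mixture over sets; each sigma must be strictly positive."""
    return sum(unit_near_extreme_cdf(r / s, n) for s in sigmas) / len(sigmas)


if __name__ == "__main__":
    sigmas = [0.8e-3, 1.0e-3, 1.5e-3]  # hypothetical per-set standard deviations
    for r in (0.5e-3, 1.0e-3, 2.0e-3, 4.0e-3):
        print(r, mixture_cdf(r, 10, sigmas))
```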



4. Results and discussion 





         C max     C min     MSFT max   MSFT min

N=10     0.493     0.678     0.923      0.961

N=25     1.06      0.784     0.602      1.137

N=50     1.23      2.03**    1.029      0.595


         INTC max  INTC min  GE max     GE min

N=10     1.524*    2.320**   0.560      1.228

N=25     1.567*    1.673**   1.226      0.706

N=50     0.954     2.697**   1.520*     1.022



Table 1: K-S statistics for the distributions depicted in Fig. 3. One star denotes failure
of the test at the 5% significance level (critical value 1.333), two stars failure at the 1%
level (critical value 1.625). The values confirm the quality of the fit for all cases but INTC, where
we experience some problems, especially when analyzing the near-extreme distribution with
respect to the minima.



        C      GE     MSFT    INTC

N=10    7946   7206   14496   14398

N=25    8472   7679   15501   15380

N=50    8624   7840   15826   15677



Table 2: Sample sizes; the K-S statistics are obtained from the formula
$\sqrt{\text{Sample size}}\,\max|P(r^{\tau}) - P_e(r^{\tau})|$.

In 2007, NYSE and NASDAQ had 251 days of open business, and in the
analyzed periods (see Sec. 3.1) we isolated more than 22.4 million events for

Figure 3: Examples of the empirical near-extreme distribution vs. the expected theoretical
distribution for the four analyzed stocks. The considered parameters are $\tau = 2500$ and
$N = 10, 25, 50$.

Figure 4: Q-Q plots for the near-extreme statistics depicted in Fig. 3. The match between
theoretical and empirical quantiles is satisfactory, except for some differences approaching 1
(upper-right part of plots). The cases $N = 25$ and $N = 50$ have been shifted for readability.



C, 20.3 for GE, 40.4 for INTC and 40.8 for MSFT. Fixing $\tau = 2500$ we obtain
more than 30 intraday returns per day for C and GE and around 65 for INTC and
MSFT. With these parameters a set of $N$ returns could be a bit longer than
a day, but its components remain inherently intraday objects. Fig. 3 shows
the main results obtained considering $N = 10, 25, 50$. The solid lines are the
"theoretical" distributions as defined by the discussion in the previous sections
(Eq. (14)), and the dots report the direct estimation of the near-extreme cdf.

The agreement between the theoretical and empirical distributions appears to be
satisfactory, especially in the first half of the distributions, where the "near" of
the term near-extreme actually applies, and it is remarkable that a single
distribution fits both near-maximum and near-minimum statistics.
In some cases the prediction loses its power when the curve approaches
unity, i.e. when the opposite tail of the log-return distribution begins to play
the main role; this is no longer the "near"-extreme regime, but actually a "far"-extreme
distribution. As already mentioned, the main aim of this paper is the
exploration of a new tool/idea, but nevertheless we can assess the significance of
our result by performing some simple statistical analyses. We choose to apply the
Kolmogorov-Smirnov test (in Tab. 1) and to show the Q-Q plots (in Fig. 4). The
K-S statistic is defined as $D = \max|P(r^{\tau}) - P_e(r^{\tau})|$ and the null hypothesis is
considered rejected at the 5% significance level when $\sqrt{\text{Sample size}}\,D$ is larger
than 1.333, and at the 1% significance level when it is larger than 1.625 [27]. The
only stock presenting systematic problems is INTC, where the fits fail especially
when applied to the minimum case. Apart from isolated failures at $N = 50$
(C min and GE max), all the other results are satisfactory. For
completeness, Tab. 2 reports the sample sizes. The Q-Q plots highlight the
good agreement of location, scale and skewness of the compared statistics, and
large deviations from the straight line are observable only when the quantiles
approach unity. In Fig. 4, plots for $N = 25$ and 50 have been shifted to facilitate
readability.
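
For reference, the scaled K-S statistic used in Tab. 1 can be computed as in the following sketch. The function names are ours; the critical values 1.333 and 1.625 are the ones quoted above, and the model cdf is passed in as an arbitrary callable. The supremum is evaluated on both sides of every jump of the empirical cdf, which also handles repeated values correctly.

```python
from math import sqrt


def ks_statistic(sorted_sample, model_cdf):
    """Return sqrt(n) * max |P - P_e| for a sample sorted in increasing order."""
    n = len(sorted_sample)
    d = 0.0
    for k, x in enumerate(sorted_sample):
        p = model_cdf(x)
        # empirical cdf just below x is at most k/n, at x it is at least (k+1)/n
        d = max(d, abs(p - k / n), abs((k + 1) / n - p))
    return sqrt(n) * d


def verdict(stat, crit_5=1.333, crit_1=1.625):
    """Star notation of Tab. 1: '' (pass), '*' (fail at 5%), '**' (fail at 1%)."""
    if stat > crit_1:
        return "**"
    if stat > crit_5:
        return "*"
    return ""


if __name__ == "__main__":
    sample = [0.25, 0.5, 0.75]
    stat = ks_statistic(sample, lambda x: x)  # uniform cdf on [0, 1]
    print(stat, verdict(stat))
```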

In general, we can say that the method gives satisfactory results for $\tau$ in
the range from a few hundred to roughly ten thousand and for $N$ between 10 and
100. When $\tau$ is too small, the local Gaussianity idea breaks down (given the
discrete nature of the prices as shown in Fig. 2 and the strong dependencies),
and when $N$ is not large enough the estimates of the variances become too
noisy. On the other hand, excessively large values of those parameters (for a
large $N$ or $\tau$, $h$ as defined in Sec. 3.2 becomes small) tend to give much more
importance to each and every extreme value, and to give poor statistics of
the variances. As a final remark, we would like to clarify that our "estimator" for
the distribution as defined in Eq. (13) can be seen as a mixture of the distributions
defined in Ref. Q, exactly in the same fashion used in Ref. [1] to fit the intraday
log-return distribution.